As internet of things (IoT) devices have been increasing since their emergence in 2010, the data stored by these devices needs proper security measures so that it does not fall into the hands of an attacker. The stored data has to be analyzed to determine whether it is safe or malicious, as malicious data can corrupt the whole information. The security model in big data faces many challenges, such as vulnerability to fake data generation, troubles with cryptographic protection, and absent security audits. As cyber-attacks are increasing, the main objective of each organization is to secure its data efficiently. This paper presents a reputation security model for the detection of biased attacks on big data. The proposed model provides various evaluation models to identify biased attacks from malicious IoT devices and provides a secure communication metric for big data. The results show better rates in terms of attack detection rate, attack detection failure rate, system throughput, and number of dead nodes as the attack rate is increased, when compared with the existing reputation-based security (ERS) model. Moreover, the proposed reputation-based biased attack detection (RBAD) model increases the security of the IoT devices in big data and reduces the biased attacks coming from various malicious nodes.

Keywords: Cyber attack, IoT, Malicious, RBAD, Security

This is an open access article under the CC BY-SA license.

1. INTRODUCTION

Internet of things (IoT) [1] is widely used in day-to-day life across different areas and applications, ranging from small sensors to large business applications. Using sensors [2] and different IoT devices, we can collect data and store it using different protocols such as hypertext transfer protocol (HTTP) [4], message queue telemetry transport protocol (MQTT) [5], and constrained application protocol (CoAP) [6]. The data is stored so that it can be utilized for the improvement of the device. Once the data has been stored, it has to be analyzed or processed so that it can be classified using an attribute. For the success of IoT applications, security, privacy, and trust play an important role. Big data [7], [8] plays a big role in IoT applications, as it analyzes data systematically by extracting it from the datasets. Moreover, security in big data is very challenging, as it is concerned with attacks [3] that can originate from either online or offline spheres. The attacks include theft of data that is stored online, ransomware, or distributed denial of service (DDoS) attacks [9] that can crash the whole server. The issue is even worse for companies where the stored data is confidential or sensitive, such as customer details, credit card numbers, or customer contact details. These attacks lead to big losses for the organization and can cause financial problems.

There are several ways in which security measures can be used to protect big data. One of the most common is to encrypt the data while transmitting it from the sender to the receiver on online platforms. Encrypted data is useless to hackers if they do not have the proper key to decrypt it. Therefore, encryption is one of the methods by which data can be protected from attackers. Another method of security in big data is to create trust between the sender and the receiver. Many methods, such as data mining [10] and supervised [11] and unsupervised [12] machine learning algorithms, have been used for trust in data quality. Trust plays an important role in security [12]-[14]. There are two ways in which the trust method can be applied to big data. The first is to establish whether the receiver can be trusted to handle the data without leaking it; the second is to establish trust between the sender and the receiver so that the data sent by the sender maintains its quality.

In view of the above problems, we propose a model which provides evaluation models to identify biased attacks from malicious IoT devices and provides a secure communication metric for big data. The paper is organized as follows. Section 2 reviews a few existing systems along with their benefits and drawbacks. Section 3 explains the proposed methodology, detailing all the evaluation methods along with the final reputation metric and communication metric. Section 4 explains the results obtained by the proposed model. Section 5 gives the conclusion and future work.


2. LITERATURE SURVEY 

Najib et al. [15] surveyed trust design models for IoT machines. The survey gives an idea of the existing systems and methods currently used for trust models, and classifies trust models using five operations: trust algorithm, trust metric, trust propagation, trust source, and trust architecture. Xiao et al. [16] used a blockchain-based trust model for mobile edge computing (MEC) that prevents different kinds of attacks. This model calculates the execution of the edge devices and sends that information to the nearby edge devices. It selects a blockchain miner, which applies different protocols to record the block. They proposed a reinforcement learning algorithm that improves the model efficiency; the security of the model is also calculated and the efficiency of the edge utility is provided. Wang et al. [17] addressed the biased load attack in smart grids using a feature selection model which selects the interval state of the node and eliminates all the unnecessary nodes that fall below a threshold. A matrix is also used to detect the attack in the sensors.

Debe et al. [18] built a model on the Ethereum blockchain and other technologies to maintain trust between public fog nodes and IoT devices. The model is evaluated in terms of performance, security, and cost, and it maintains trust longer than the existing trust models. Alshammaria and Alsubhi [13] proposed a trust model to identify biased attacks from malicious nodes. In this model, security is provided to the services using trust algorithms that can identify the nodes. The results show that the model provides good accuracy in detecting malicious biased attacks. Ghafoorian et al. [19] describe a model based on role-based access control (RBAC), which can prevent security threats and has less execution time than RBAC. The paper explains the security methods required for a trust-based system, and the model is evaluated using the Advogato dataset. Liu et al. [20] proposed a blockchain-authorized group-validation method for vehicles with distributed identification, based on secret sharing and dynamic proxy mechanisms. The validation values are used for the combined authentication. The node with a higher reputation, which does the edge computing, is stored in the blockchain and uploads the final authentication values to the server. Zhang et al. [21] proposed a scheme based on the probabilistic skyline computation technique. The scheme first assigns a trust score to each individual based on performance without revealing the information to other nodes, and then selects the subset of reliable individuals for a particular task. It helps preserve the individual's identity and showed higher efficiency in extensive simulations.

Yuan and Li [22] proposed a mechanism for IoT edge devices based on multi-source feedback information fusion. Two mechanisms are used: the multi-source feedback mechanism and the lightweight trust evaluating mechanism. An algorithm named feedback information fusion is used, which overcomes the limitations of traditional trust schemes. This mechanism provides higher reliability and efficiency. Nguyen et al. [23] proposed trustworthy access control using the blockchain, which protects the clouds from illegal offloading. To protect the offloading from various attacks, a deep reinforcement learning (DRL) algorithm using a Q-network is employed. Xu et al. [24] proposed another offloading method using blockchain-based secure computation. It uses the DRL algorithm, and for trust management, short-term trust variability and long-term reputation are considered. For a more complete reputation, three-valued subjective logic (3VSL) is used. Different kinds of blockchains are used to store, validate, and update the data, and the DRL algorithm is used to implement intelligent and secure computation in vehicular cloud networks. Mohammadi et al. [25] reviewed various existing trust-based IoT recommendation techniques. Figure 1 gives an example of how trust is evaluated according to [25].

Figure 1. Example of a suggestion based on trust IoT modeling [25]

3. METHOD

Figure 2 shows the proposed architecture, in which various evaluation methods are used to detect the biased attack. First, the input is passed to the model; the data contains malicious nodes which are constantly oscillating from one node to another. The input goes through various evaluation models, each of which checks which node has more reputation, and after each evaluation it passes to the next evaluation process. In the final step, we provide a secure communication metric for big data, which outputs a reputed node free of malicious nodes. Each evaluation method is described in the following subsections.

3.1. Response and reputation evaluation

In this part, we evaluate the model using the response and reputation in cluster-based wireless sensor networks (WSNs). We first compute the reputation level of the WSN using the cluster head. We validate that the sensor keeps the trust-level information of the entire session when a connection is made from one software device to another. To minimize the storage cost between IoT devices and to allocate weights to a specific session, we use the exponential mean update process to store the result of the trust level. Let $Sec_o^u(x, y)$ denote the reputation level, where $x$ and $y$ are different IoT devices having their $o$-th communication in the $u$-th session period. The overall process can be described using (1):

$$Sec_o^u(x, y) = \gamma \cdot Sec_{rec}(x, y) + (1 - \gamma) \cdot Sec_{o-1}^u(x, y) \tag{1}$$

where $Sec_{rec}$ depicts the reputation level parameter of the most current operation and $\gamma$ is the weight given to it. The response-based security model, in which the sensor gives a response about the quality of experience with the other IoT devices, is given by (2):


$$Sec_{rec} \in \begin{cases} 0, & \text{if the interaction is completely untrustable} \\ 1, & \text{if the interaction is completely trustable} \end{cases} \tag{2}$$

In (1), $\gamma$ is improved using the variance:

$$\gamma = W + d \cdot \frac{\mu_o^u(x, y)}{1 + \beta_o^u(x, y)} \tag{3}$$

The accumulated result is calculated using (4):

$$\beta_o^u(x, y) = d \cdot \mu_o^u(x, y) + (1 - d) \cdot \beta_{o-1}^u(x, y) \tag{4}$$

In (4), $d$ is a constant which shows how the sensor behaves if there is any failure in $\mu_o^u(x, y)$. The $\mu_o^u(x, y)$ is calculated using (5):

$$\mu_o^u(x, y) = \left| Sec_{o-1}^u(x, y) - Sec_{rec} \right| \tag{5}$$


Therefore, if $d$ is decreased, the model gives less weight to the current variance than to the previously collected variance. Moreover, as $\mu_o^u(x, y)$ increases, $\gamma$ also increases, so that greater significance is given to the current response. The $W$ in (3) is the reputation weight, a threshold value that prevents $\gamma$ from becoming a fixed parameter. The variation in the reputation response of different IoT devices $x$ and $y$ is found by comparing the sensor devices that both $x$ and $y$ have communicated with, as given in (6):

$$V_o^u(x, y) = \sqrt{\frac{\sum_{p \in H(x, y)} \left( Sec_o^u(x, p) - Sec_o^u(y, p) \right)^2}{|H(x, y)|}} \tag{6}$$

In (6), $p$ depicts the common IoT devices and $H(x, y)$ is the set of common sensor devices that have interacted with both sensor devices $x$ and $y$. To establish the connection $R_o^u(x, y)$ between sensor devices $x$ and $y$, the model first compares $V_o^u(x, y)$ with the connection threshold $J$. The connection is then updated using (7):

$$R_o^u(x, y) = \begin{cases} R_{o-1}^u(x, y) + \dfrac{1 - R_{o-1}^u(x, y)}{X}, & \text{if } V_o^u(x, y) < J \\ R_{o-1}^u(x, y) - \dfrac{R_{o-1}^u(x, y)}{Y}, & \text{else} \end{cases} \tag{7}$$

In (7), $X$ is the award parameter and $Y$ is the penalty parameter; both can be changed in a dynamic environment based on the security requirement.
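
The update rules of this subsection can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names and the default values of `W`, `d`, `J`, `X`, and `Y` are illustrative, the adaptive weight is assumed to have the form W + d·μ/(1 + β) capped at 1, the variation is assumed to be a root-mean-square difference over common devices, and the award and penalty steps are assumed to move the connection value towards 1 and 0 respectively.

```python
from math import sqrt

def update_response(sec_prev, sec_rec, beta_prev, W=0.25, d=0.5):
    """Apply one response update; returns (new reputation level, new accumulated deviation)."""
    mu = abs(sec_prev - sec_rec)                    # deviation of the newest response, as in (5)
    beta = d * mu + (1 - d) * beta_prev             # accumulated deviation, as in (4)
    gamma = min(1.0, W + d * mu / (1 + beta))       # ASSUMED form of the adaptive weight, capped at 1
    sec = gamma * sec_rec + (1 - gamma) * sec_prev  # exponential mean update, as in (1)
    return sec, beta

def variation(sec_x, sec_y):
    """ASSUMED RMS difference between the levels x and y hold about their common devices."""
    common = sec_x.keys() & sec_y.keys()
    if not common:
        return None  # no common device, nothing to compare
    return sqrt(sum((sec_x[p] - sec_y[p]) ** 2 for p in common) / len(common))

def update_connection(r_prev, v, J=0.2, X=4.0, Y=2.0):
    """ASSUMED award/penalty step: grow towards 1 when v is below J, shrink towards 0 otherwise."""
    if v < J:
        return r_prev + (1 - r_prev) / X
    return r_prev - r_prev / Y

# Hypothetical values: a fully trustable response arrives after a neutral history.
sec, beta = update_response(sec_prev=0.5, sec_rec=1.0, beta_prev=0.0)
v = variation({"a": 0.9, "b": 0.8}, {"a": 0.8, "b": 0.9, "c": 0.1})
r = update_connection(r_prev=0.5, v=v)
print(sec, beta, v, r)
```

With these illustrative settings, a large deviation between the newest response and the stored level raises the weight on the newest response, which is the behavior described for γ above.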


3.2. Response reputation evaluation 


In this part, the reputation of the response is evaluated using the response data. In current trust-based security models for WSNs, the trust data given by a good sensor device is assumed to be true, and the response given by a malicious sensor device is assumed to be false. However, a good sensor device may send wrong trust data, and a malicious sensor device may send correct data to hide its malicious behavior. Therefore, it is very important to have a method by which the response reputation can be evaluated. To compute the trust level, the response given by a sensor device with good performance, or data whose quality is trustworthy, is used. Hence, more weight is given to a sensor device that has high reliability and less weight to a sensor device that has low reliability. Let $F_o^u(x, y)$ denote the response trust of sensor device $y$ from sensor device $x$; its value can be evaluated using (8):

$$F_o^u(x, y) = \begin{cases} 1 - \dfrac{\log\left(Sec_o^u(x, y)\right)}{\log \theta}, & \text{if } R_o^u(x, y) > \theta \\ 0, & \text{else} \end{cases} \tag{8}$$

In (8), $\theta$ is the least tolerable value of the parameter.


3.3. Explicit reputation evaluation 

In this part, explicit reputation evaluation is done using the intra-cluster and inter-cluster communication used in different kinds of WSNs. Let $L_o^u(x, y)$ denote the explicit reputation data that sensor device $x$ has on sensor device $y$ for interaction $o$ in the $u$-th session. The explicit reputation is calculated using (9):

$$L_o^u(x, y) = Sec_o^u(x, y) \tag{9}$$

By (9), if sensor device $y$ gives a better performance, sensor device $x$ can take that performance as its reputation parameter for $y$. This helps establish trust between sensor devices $x$ and $y$.


3.4. Implicit reputation evaluation 

In this part, the implicit reputation evaluation is done using the intra-cluster and inter-cluster communication used in different kinds of WSNs. To achieve secure transmission of data from the sensors, a sensor device asks the other sensors that have worked with a specific sensor device to send their response data about it. The sensor device then collects the responses from the different sensor devices to calculate the implicit reputation using (10), where $Z = S(y)$ is the set of sensor devices that have already communicated with sensor device $y$:

$$G_o^u(x, y) = \begin{cases} \dfrac{\sum_{p \in Z - \{x\}} F_o^u(x, p) \cdot L_o^u(p, y)}{\sum_{p \in Z - \{x\}} F_o^u(x, p)}, & \text{if } |Z - \{x\}| > 0 \\ 0, & \text{else} \end{cases} \tag{10}$$
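
The implicit reputation can be read as a weighted average, which the following Python sketch illustrates. It is a sketch under assumptions rather than the authors' code: it assumes that each reporting device's explicit reputation of y is weighted by the response trust that x places in that reporter, and that the result is 0 when no other device has reported or all weights are 0. The function name and the dictionaries `F` and `L`, keyed by device pairs, are hypothetical.

```python
def implicit_reputation(x, y, F, L, Z):
    """Trust-weighted average of what the devices in Z (other than x) report about y.

    F[(x, p)] is the response trust x places in reporter p (hypothetical layout);
    L[(p, y)] is the explicit reputation reporter p holds about y.
    """
    others = [p for p in Z if p != x]
    total_weight = sum(F.get((x, p), 0.0) for p in others)
    if not others or total_weight == 0:
        return 0.0  # ASSUMPTION: no usable report yields an implicit reputation of 0
    weighted = sum(F.get((x, p), 0.0) * L.get((p, y), 0.0) for p in others)
    return weighted / total_weight

# Hypothetical example: reporter "a" is fully trusted, reporter "b" only half trusted.
F = {("x", "a"): 1.0, ("x", "b"): 0.5}
L = {("a", "y"): 0.8, ("b", "y"): 0.2}
g = implicit_reputation("x", "y", F, L, Z={"a", "b", "x"})
print(g)
```

In this example the report of the more trusted device dominates, so the result lies closer to 0.8 than to 0.2.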


3.5. Recent reputation evaluation 

In this part, the recent reputation evaluation is done using the intra-cluster and inter-cluster communication used in different kinds of WSNs. The reputation parameter is calculated using the explicit and implicit reputation metrics. The explicit reputation is given a higher weight when the sensor device doing the computing has more interaction with the target sensor device. Let $C_o^u(x, y)$ be the reputation parameter expressing the trust that sensor device $x$ has in sensor device $y$; then:

$$C_o^u(x, y) = \delta \cdot L_o^u(x, y) + (1 - \delta) \cdot G_o^u(x, y) \tag{11}$$

In (11), $\delta$ is the weight of the reputation parameter, calculated using (12):

$$\delta = \frac{T^u(x, y)}{T^u(x, y) + \bar{T}^u(x, y)} \tag{12}$$

In (12), $T^u(x, y)$ is the number of times sensor device $x$ has communicated with sensor device $y$ in the $u$-th session, and $\bar{T}^u(x, y)$ is the mean size of the communication calculated when sensor $x$ and sensor $y$ have a reputable connection. The $\bar{T}^u(x, y)$ is calculated using (13):

$$\bar{T}^u(x, y) = \frac{\sum_{p \in Z - \{x\}} F_o^u(x, p) \cdot T^u(p, y)}{|Z - \{x\}|} \tag{13}$$
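
The weighting between explicit and implicit reputation can be illustrated with a short Python sketch. It is not the authors' code: it assumes that the weight is the ratio of x's own interaction count with y to the sum of that count and a mean interaction count, and that the mean is the trust-weighted sum of the other devices' counts divided by the number of other devices. The function names, the dictionaries keyed by device pairs, and the convention that the weight is 0 when there are no interactions at all are assumptions made for illustration.

```python
def mean_interactions(x, y, F, T, Z):
    """Trust-weighted mean number of interactions the other devices in Z had with y."""
    others = [p for p in Z if p != x]
    if not others:
        return 0.0
    return sum(F.get((x, p), 0.0) * T.get((p, y), 0) for p in others) / len(others)

def recent_reputation(explicit, implicit, t_xy, t_mean):
    """Blend explicit and implicit reputation; more own interactions favor the explicit part."""
    denom = t_xy + t_mean
    delta = t_xy / denom if denom > 0 else 0.0  # ASSUMPTION: weight 0 without any interactions
    return delta * explicit + (1 - delta) * implicit

# Hypothetical example: x interacted with y six times, the others fewer times on average.
F = {("x", "a"): 1.0, ("x", "b"): 0.5}
T = {("a", "y"): 2, ("b", "y"): 4}
t_mean = mean_interactions("x", "y", F, T, Z=["a", "b", "x"])
c = recent_reputation(explicit=0.9, implicit=0.5, t_xy=6, t_mean=t_mean)
print(t_mean, c)
```

Because x has more first-hand interactions than the weighted mean of the other devices, the explicit reputation receives the larger weight in this example.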

3.6. Past experience reputation evaluation

In this part, the past experience reputation evaluation is done using the intra-cluster and inter-cluster communication used in different kinds of WSNs. As time passes, the current reputation parameter becomes the old reputation parameter. Like the reputation parameter, the experience parameter is calculated using the exponentially averaging update method. Let $L'^{u}_{o}(x, y)$, distinct from the explicit reputation $L_o^u(x, y)$ of (9), denote the past reputation parameter that sensor device $x$ has on sensor device $y$:

$$L'^{u}_{o}(x, y) = \frac{\varphi \cdot L'^{u}_{o-1}(x, y) + C_{o-1}^u(x, y)}{2} \tag{14}$$

In (14), $\varphi$ $(0 < \varphi < 1)$ is the award parameter and $L'^{u}_{0}(x, y) = 0$. With the past experience reputation parameter, a malicious IoT device that is currently communicating with the sensor device cannot hide its past performance. A sensor device that currently behaves ideally has to communicate properly over a significant number of connections before its reputation parameter can replace the past reputation parameter.


3.7. Anticipated reputation evaluation 

In this part, the anticipated reputation evaluation is done using the intra-cluster and inter-cluster communication used in different kinds of WSNs. To evaluate the anticipated reputation parameter, both the current reputation parameter $C_o^u(x, y)$ and the past experience reputation parameter $L'^{u}_{o}(x, y)$ are used. Let $F'^{u}_{o}(x, y)$ denote the anticipated reputation parameter that sensor device $x$ has on sensor device $y$. It is calculated using (15):

$$F'^{u}_{o}(x, y) = \begin{cases} 0, & \text{if neither } L' \text{ nor } C \text{ is available} \\ \alpha \cdot C_o^u(x, y) + (1 - \alpha) \cdot L'^{u}_{o}(x, y), & \text{if either } L' \text{ or } C \text{ is available} \end{cases} \tag{15}$$

Here, $\alpha$ is initially set to zero. It can, however, be changed according to the dynamic environment using the deviation factor $\omega$, as shown in (16):

$$\alpha = \begin{cases} \alpha + 0.1, & \text{if } C_o^u(x, y) - L'^{u}_{o}(x, y) > \omega \\ \alpha - 0.1, & \text{if } C_o^u(x, y) - L'^{u}_{o}(x, y) < -\omega \\ \alpha, & \text{if } -\omega \le C_o^u(x, y) - L'^{u}_{o}(x, y) \le \omega \end{cases} \tag{16}$$

Using $\omega$, the model can work in a dynamic environment: $\omega$ determines how fast a sensor device can move away from its past reputation parameter. The value of $\omega$ must be set very small, since a malicious sensor device can otherwise use this parameter to escape its earlier malicious performance.
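
The interplay between the anticipated reputation and its adaptive weight can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: the function names, the default deviation factor of 0.05, the clamping of the weight to the range 0 to 1, and the treatment of a missing value as 0 are all assumptions introduced here.

```python
def adapt_alpha(alpha, current, past, omega=0.05, step=0.1):
    """Shift the weight by 0.1 towards the current reputation when it exceeds the past one
    by more than omega, and away from it when it falls short by more than omega."""
    diff = current - past
    if diff > omega:
        alpha += step
    elif diff < -omega:
        alpha -= step
    return min(1.0, max(0.0, alpha))  # ASSUMPTION: keep the weight within [0, 1]

def anticipated_reputation(alpha, current=None, past=None):
    """Blend current and past reputation; 0 when neither value is available."""
    if current is None and past is None:
        return 0.0
    current = 0.0 if current is None else current  # ASSUMPTION: a missing value counts as 0
    past = 0.0 if past is None else past
    return alpha * current + (1 - alpha) * past

# Hypothetical example: the current reputation is well above the past one.
alpha = adapt_alpha(alpha=0.0, current=0.9, past=0.2)
f_anticipated = anticipated_reputation(alpha, current=0.9, past=0.2)
print(alpha, f_anticipated)
```

Starting from a weight of zero, a single update moves only 10% of the weight to the current reputation, so a device cannot discard a poor history in one step.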


3.8. Unfair and oscillating reputation evaluation 

In this part, unfair and oscillating reputation evaluation is done using the intra-cluster and inter-cluster communication used in different kinds of WSNs. A biased attack can come from a malicious IoT device. Malicious IoT devices may purposefully vary between states so that they affect the reputation parameter, which affects the overall performance of the network. We accumulate the deviation between the current reputation parameter $C_o^u(x, y)$ and the past reputation parameter $L'^{u}_{o}(x, y)$ to calculate the oscillation using (17):

$$D_o^u(x, y) = \begin{cases} D_{o-1}^u(x, y) + \dfrac{C_o^u(x, y) - L'^{u}_{o}(x, y)}{\rho}, & \text{if } C_o^u(x, y) - L'^{u}_{o}(x, y) > t \\ D_{o-1}^u(x, y) + L'^{u}_{o}(x, y) - C_o^u(x, y), & \text{if } C_o^u(x, y) - L'^{u}_{o}(x, y) < -t \\ D_{o-1}^u(x, y), & \text{otherwise} \end{cases} \tag{17}$$

In (17), $t$ is the acceptance parameter of the reputation error in evaluating the reputation parameter, and $\rho$ $(\rho > 1)$ is the penalty parameter for the acceptance in the reputation parameter. The unfair and oscillating reputation of the sensor device, $D'^{u}_{o}(x, y)$, can then be calculated from (17) as follows:

$$D'^{u}_{o}(x, y) = \begin{cases} 0, & \text{if } D_o^u(x, y) > \max D^u(x, y) \\ \cos\left( \dfrac{\pi}{2} \cdot \dfrac{D_o^u(x, y)}{\max D^u(x, y)} \right), & \text{otherwise} \end{cases} \tag{18}$$
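
The accumulation of oscillation and its mapping to a reputation value can be sketched in Python as follows. This is a minimal sketch under assumptions, not the authors' code: it assumes that a rise of the current reputation above the past one is accumulated after division by the penalty parameter, that a drop is accumulated in full, and that the accumulated value is mapped through a cosine that reaches 0 at a maximum value. The function names and the defaults for `t`, `rho`, and `d_max` are hypothetical.

```python
from math import cos, pi

def accumulate_deviation(d_prev, current, past, t=0.1, rho=2.0):
    """Accumulate the gap between current and past reputation when it exceeds tolerance t."""
    diff = current - past
    if diff > t:
        return d_prev + diff / rho  # ASSUMED: sudden rises are damped by rho > 1
    if diff < -t:
        return d_prev - diff        # sudden drops are added in full (past - current)
    return d_prev                   # within tolerance: nothing is accumulated

def oscillation_reputation(d, d_max=1.0):
    """Map accumulated deviation to [0, 1]: 1 for a steady device, 0 at or beyond d_max."""
    if d_max <= 0 or d > d_max:
        return 0.0
    return cos(pi / 2 * d / d_max)

# Hypothetical example: a device swings up once and then down once.
d = accumulate_deviation(0.0, current=0.9, past=0.5)  # rise of 0.4, damped to 0.2
d = accumulate_deviation(d, current=0.3, past=0.8)    # drop of 0.5, added in full
print(d, oscillation_reputation(d))
```

Each swing adds to the accumulated deviation, so a device that keeps alternating between good and bad behavior sees its oscillation reputation fall towards 0.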

3.9. Reputation metric for detecting malicious IoT device

In this part, the secure reputation metric is evaluated using the intra-cluster and inter-cluster communication used in different kinds of WSNs. The security metric for the reputation parameter, $F''^{u}_{o}(x, y)$, is calculated from the anticipated reputation parameter $F'^{u}_{o}(x, y)$ and the oscillating reputation parameter $D'^{u}_{o}(x, y)$ using (19):

$$F''^{u}_{o}(x, y) = F'^{u}_{o}(x, y) \cdot D'^{u}_{o}(x, y) \tag{19}$$

From (19), a sensor device with a high anticipated reputation parameter but a low oscillating reputation parameter ends up with a low overall reputation. A sensor device which purposefully oscillates between states will have a low reputation parameter because of its low oscillating reputation parameter. If a sensor device wants to gain a higher accumulated reputation parameter, it should not show any oscillation of the reputation parameter. Hence, (19) is used to select the sensor device with a higher reputation parameter, which combines all the security parameters required to achieve an effective security method for different kinds of WSN applications.
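
A small numerical illustration of the product in (19) is given below in Python. The function name and the sample values are hypothetical and chosen only to show the effect described above.

```python
def security_metric(anticipated, oscillation):
    """Overall reputation as the product of anticipated and oscillating reputation."""
    return anticipated * oscillation

# Hypothetical devices: one steady with moderate reputation, one oscillating with high reputation.
steady = security_metric(anticipated=0.7, oscillation=1.0)
oscillating = security_metric(anticipated=0.9, oscillation=0.5)
print(steady, oscillating)
```

Although the oscillating device has the higher anticipated reputation, its overall metric is lower than that of the steady device, so it is less likely to be selected.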


3.10. Secure communication metric for big data collection environment 

In this part, the secure communication metric for the big data collection environment is evaluated. Using (19), we distinguish good sensor devices from malicious ones. When packets are always sent to a sensor device that is currently ideal, the energy consumption is concentrated on the sensor devices in the cluster network that have a higher reputation parameter. Therefore, to balance the load among the cluster heads, the traffic, i.e., the energy consumption on the sensor device, is first evaluated using (20), where $T^u$ is the number of communications and $F_o^u$ is the response trust:

$$T'^{u}(x, y) = T^u(x, y) + \sum_{p \in Z - \{x\}} F_o^u(x, p) \cdot T^u(p, y) \tag{20}$$

After the evaluation of the traffic, the cluster head to transmit the packet is selected using (21):

$$\min \sum_{p \in Z - \{x\}} T'^{u}(x, p) \tag{21}$$

If a new sensor does not yet have a reputation value, the selection probability of a sensor device is calculated using (22), where $F''^{u}_{o}$ is the reputation metric of (19):

$$P(x, y) = \begin{cases} \dfrac{F''^{u}_{o}(x, y)}{\sum_{p \in V} F''^{u}_{o}(x, p)}, & \text{if } \sum_{p \in V} F''^{u}_{o}(x, p) \neq 0 \\ \text{arbitrarily choose any sensor device}, & \text{else} \end{cases} \tag{22}$$

To select a sensor device from $V$, the model uses (22), so that a device with a high reputation parameter has a high probability of being selected. When the reputation parameters are zero, the sensor device is selected randomly. The proposed method performs better than the existing reputation-based models, as shown in the results and discussions section.
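
The reputation-proportional selection can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the function names and the dictionary `F` of reputation values keyed by device pairs are hypothetical, the candidate set `V` is assumed to be non-empty, and the arbitrary choice made when all reputations are zero is modeled here as a uniform random choice.

```python
import random

def selection_probabilities(x, V, F):
    """Probability of x selecting each device in V, proportional to its reputation."""
    total = sum(F.get((x, p), 0.0) for p in V)
    if total == 0:
        # ASSUMPTION: "arbitrarily choose any sensor device" is modeled as a uniform choice.
        return {p: 1.0 / len(V) for p in V}
    return {p: F.get((x, p), 0.0) / total for p in V}

def select_device(x, V, F, rng=random):
    """Draw one device from V according to the selection probabilities."""
    probs = selection_probabilities(x, V, F)
    devices = list(probs)
    return rng.choices(devices, weights=[probs[p] for p in devices], k=1)[0]

# Hypothetical example: device "a" has three times the reputation of device "b".
F = {("x", "a"): 0.6, ("x", "b"): 0.2}
probs = selection_probabilities("x", ["a", "b"], F)
chosen = select_device("x", ["a", "b"], F)
print(probs, chosen)
```

Selecting probabilistically rather than always picking the top-ranked device spreads traffic across reputable devices, which matches the load-balancing motivation given above.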


4. RESULTS AND DISCUSSIONS 

In this section, the results are evaluated using a simulator, in which various network parameters were set as follows. The network simulation area considered for the detection of the attack was 100 × 100 m. The number of edge servers was 4. The number of IoT devices ranged from 500 to 1000. As the number of IoT devices increased, the share of malicious IoT devices was set to 10, 20, 30, and 40 percent. The MAC protocol used in the simulation was IEEE 802.11b. The radio propagation range of each device was set to 6 meters and the sensing range to 3 meters. Each sensor device's energy was set in the range from 0.05 to 0.2 Joules. The dissipation of the radio energy was set to 50 nJ/bit. The lengths of the control packet and data packet were set to 248 bits and 2,000 bits, respectively. The transmission speed of the IoT device was 100 bits and the bandwidth was set to 1,0000 bits. The sensing event of each device was 0.1 seconds. The idle energy consumption of each IoT device was set to 50 nJ/bit and the amplification energy to 100 pJ/bit/m². The model has been compared with the existing reputation-based security (ERS) models [16], [22].

4.1. Attack detection rate vs attack rate in sensor devices

In this section, the experimental results for the detection rate of malicious attacks on the IoT devices are given. Figure 3 shows that the reputation-based biased attack detection (RBAD) model detects more malicious attacks in the IoT devices than the existing ERS model. In Figure 3, the RBAD model performs 16%, 29%, 19.5%, and 19.48% better than the existing ERS model for 10%, 20%, 30%, and 40% attack rates, respectively.

Figure 3. Attack detection rate vs attack rate


4.2. Attack detection failure rate vs Attack rate in sensor devices 

In this section, the experimental results for the attack detection failure rate in the IoT devices are given. In Figure 4, the RBAD model shows fewer failures to detect malicious attacks in the IoT devices than the existing ERS model. The RBAD model performs 32.4%, 26.2%, 14.28%, and 39.4% better than the existing ERS model for 10%, 20%, 30%, and 40% attack rates, respectively.

Figure 4. Attack detection failure rate vs attack rate

Figure 5. Throughput vs attack rate


4.3. Performance of the system throughput of the model 

In this section, the experimental results of the system throughput in the IoT devices are given. Figure 5 shows the throughput of the RBAD model and the ERS model. The RBAD model shows a higher throughput than the existing ERS model: the ERS model showed a throughput of 0.46, 0.20, 0.16, and 0.14, whereas the RBAD model showed a throughput of 0.55, 0.28, 0.20, and 0.18 for 10%, 20%, 30%, and 40% attack rates, respectively. These results indicate that our model achieves a higher throughput than the existing model at every tested attack rate.
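
The relative throughput gain implied by these values can be worked out with a few lines of Python. The percentages below are derived here from the reported throughput values and are not figures stated by the authors; the function name is hypothetical.

```python
def throughput_gain(baseline, proposed):
    """Relative gain of the proposed model over the baseline, in percent, per attack rate."""
    return [(p - b) / b * 100 for b, p in zip(baseline, proposed)]

attack_rates = [10, 20, 30, 40]    # percent of malicious IoT devices
ers = [0.46, 0.20, 0.16, 0.14]     # reported ERS throughput
rbad = [0.55, 0.28, 0.20, 0.18]    # reported RBAD throughput

gains = throughput_gain(ers, rbad)
for rate, gain in zip(attack_rates, gains):
    print(f"{rate}% attack rate: RBAD throughput is {gain:.1f}% higher than ERS")
```

Computed this way, the gain is roughly 19.6%, 40.0%, 25.0%, and 28.6% for the four attack rates.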

4.4. Dead nodes vs attack rate

In this section, the experimental results for the number of dead nodes versus attack rate in the IoT devices are given. Figure 6 shows the number of dead nodes with an attack rate of 10% in the IoT devices. Similarly, Figures 7-9 show the number of dead nodes with attack rates of 20%, 30%, and 40%, respectively. In all the figures, the number of dead nodes is plotted against the simulation time, and it can be seen that our model reduces the number of dead nodes over time, which provides more security to the model.


Figure 6. Number of dead nodes with an attack rate of 10%

Figure 7. Number of dead nodes with an attack rate of 20%

Figure 8. Number of dead nodes with an attack rate of 30%

Figure 9. Number of dead nodes with an attack rate of 40%


5. CONCLUSION 

In this paper, we have presented the RBAD model to detect biased attacks in big data using a reputation-based method. Experiments on the model using data from IoT devices show better performance when compared with the existing ERS model. Results were discussed for the attack detection rate, attack detection failure rate, system throughput, and dead nodes vs attack rate. In all of them, the RBAD model performed well when compared with the existing ERS model. Hence, the RBAD model provides more security to the IoT devices in big data.